Name | Version | Summary | Date |
benchwise | 0.1.0a1 | The GitHub of LLM Evaluation - Python SDK | 2025-07-08 10:16:01 |
actbench | 0.0.1a5 | A framework for evaluating web automation agents and LAM systems. | 2025-02-27 23:49:24 |
examinationrag | 0.1.4 | XRAG: eXamining the Core - Benchmarking Foundational Component Modules in Advanced Retrieval-Augmented Generation | 2025-02-07 04:58:08 |
airflow-parse-bench | 1.0.1 | Easily measure and compare your Airflow DAGs' parse time. | 2025-01-26 03:39:23 |
nxbench | 0.1.24 | A centralized benchmarking suite to facilitate comparative profiling of tools across graph analytic libraries and datasets | 2024-12-28 08:14:36 |
mqt.bench | 1.1.9 | MQT Bench - A MQT tool for Benchmarking Quantum Software Tools | 2024-12-01 16:56:06 |
jax-hpc-profiler | 0.2.9 | HPC Plotter and profiler for benchmarking data made for JAX | 2024-11-29 19:38:37 |
cmdbench | 0.1.22 | Quick and easy benchmarking for any command's CPU, memory, disk usage and runtime. | 2024-11-20 06:53:26 |
multi-comp-matrix | 0.0.2 | Multi Comparison Matrix: A long term approach to benchmark evaluations | 2024-11-04 17:02:29 |
flow-judge | 0.1.2 | A small yet powerful LM Judge | 2024-10-29 07:32:52 |
miRBench | 1.0.0 | A collection of datasets and predictors for benchmarking miRNA target site prediction algorithms | 2024-10-15 11:37:58 |
posebench | 0.5.0 | Comprehensive benchmarking of protein-ligand structure generation methods | 2024-09-30 16:22:19 |
mlos-viz | 0.6.1 | Visualization Python interface for benchmark automation and optimization results. | 2024-08-16 18:15:54 |
mlos-bench | 0.6.1 | MLOS Bench Python interface for benchmark automation and optimization. | 2024-08-16 18:15:52 |
pysniffer | 0.2.0 | A Python package for profiling and measuring function performance. | 2024-08-16 17:46:15 |
parsbench | 0.1.7 | ParsBench provides toolkits for benchmarking LLMs based on the Persian language tasks. | 2024-08-15 11:36:10 |
geoconv | 0.0.7 | Intrinsic Surface Convolutions for everyone! | 2024-08-02 08:08:01 |
nett-benchmarks | 0.4.1 | A testbed for comparing the learning abilities of newborn animals and autonomous artificial agents. | 2024-07-30 14:54:57 |
mnt.bench | 0.2.14 | MNT Bench - An MNT tool for Benchmarking FCN circuits | 2024-07-25 13:52:57 |
lips-benchmark | 0.2.7 | LIPS : Learning Industrial Physical Simulation benchmark suite | 2024-07-22 15:44:49 |